I show that the most common method of quantifying the likelihood that an extreme galaxy cluster
could exist is biased and can result in false claims of tension with ΛCDM. This common method
uses the probability that at least one cluster could exist above the mass and redshift of an observed
cluster. I demonstrate the existence of the bias using sample cluster populations, describe its origin
and explain how to remove it. I then suggest potentially more suitable and unbiased measures of
the rareness of individual clusters. Each different measure will be most sensitive to different possible
types of new physics. I show how to generalise these measures to quantify the total 'rareness' of a
set of clusters. It is seen that, when mass uncertainties are marginalised over, there is no tension
between the standard ΛCDM cosmological model and the existence of any observed set of clusters.
As a case study, I apply these rareness measures to sample cluster populations generated using
primordial density perturbations with a non-Gaussian spectrum.
I. INTRODUCTION



Observations of galaxy clusters with sufficiently large masses, at sufficiently large redshifts, can provide strong
constraints on possible deviations from the standard ΛCDM cosmological model with Gaussian primordial perturbations.
Even the observation of a single, sufficiently 'big' and 'early' cluster could rule out the standard model with
a high level of confidence [?]. Furthermore, a number of clusters have recently been measured to have masses and
redshifts [?] that do, even individually, seem to create tension with the combination of ΛCDM and Gaussian initial
perturbations. When attempts have been made to quantitatively measure the degree to which these clusters are
collectively "too big, too early", large discrepancies between ΛCDM and the clusters have been claimed [10, 11]. More
of these high mass and high redshift clusters have subsequently been detected by the South Pole Telescope (SPT) [?]
and many more will be detected in the near future by SPT, the Planck satellite, the Dark Energy Survey and the
XMM Cluster Survey (XCS).

The formation of extreme galaxy clusters depends on both the size of the primordial density perturbations and
how long these perturbations have had to grow. In the simplest theoretical models, clusters form when $\delta_R$, which
is the fractional density contrast, $\delta$, averaged over an approximately spherical region of space with radius $R$, crosses
a particular threshold. Due to the assumption of statistical homogeneity, at a given time, every point in space has
the same probability distribution for $\delta$. Therefore, when $\delta$ is averaged over larger volumes, the fluctuations in $\delta_R$
become smaller. This means that regions with large radii are less likely to collapse than regions with small radii.
Unfortunately, after collapse it is not possible to directly measure the radius of the region that collapsed to form a
cluster. Nevertheless, according to theory, every cluster forms when $\delta_R$ crosses one definite threshold. This implies
that a cluster with a larger mass came from a larger region of space. Therefore, it is expected that fewer clusters will
form at larger masses. Fewer clusters are also expected to form at higher redshifts simply because the initial density
contrast has had less time to grow. Therefore, the masses and redshifts of observed clusters are good parameters for
quantifying the rareness of extreme galaxy clusters. As a result, if it could be shown that the most extreme clusters
did form "too big, too early", or even "too little, too late", it would be possible to conclude that either the primordial
perturbations do not have a Gaussian spectrum with scale dependence given by a simple power law, or that the calibration
between time and redshift is wrong and therefore clusters did not form as early as is inferred from ΛCDM.

It would seem that a very natural method to quantify the rareness of an extreme galaxy cluster is to ask "what is
the probability of observing at least one cluster at this mass and redshift or above ($>M>z$)?" Once uncertainties
in mass measurements, selection functions and cosmological parameters are properly marginalised over, this question
can be converted into a measure of the probability that at least one cluster this rare could be observed. It is this
resulting measure that has been used to claim that the existence of certain clusters provides tension with ΛCDM and a
Gaussian, power law, spectrum of density perturbations. This measure is biased. When rareness is measured without
bias it is seen that, in ΛCDM, there is a greater than 50% probability that the rarest currently observed cluster could
exist somewhere in the fraction of sky observed. Similarly, when the combined rareness of a set of clusters is calculated
correctly, it is seen that there is a greater than 20% probability that even the entire set of rarest observed clusters
could exist. Nevertheless, it is still true that quantifying the rareness of extreme galaxy clusters might provide useful
information about possible new physics, as will be explored in Section IV.

The core of this paper is structured as follows. In Section II I will discuss the $>M>z$ measure of rareness
and explain why this measure should be expected to give biased results. Following this, in Section II A I will show
explicitly that it does lead to biased results when applied to sample cluster populations that were generated assuming
a ΛCDM universe. In Section III A I will show how to alter this measure to remove the bias. I will also present
other well motivated measures to quantify the rareness of individual clusters. Different measures will be more useful
depending on what specific deviations from ΛCDM are being tested. In Section III B I will introduce measures that
combine the individual rarenesses of a set of clusters to quantify the total rareness of the set as a whole. In both
Section III A and Section III B I will apply the introduced measures of rareness to sets of observed galaxy clusters.
Finally, in Section IV I will apply these rareness measures to sample cluster populations generated from non-Gaussian
primordial perturbations.



A. The expected number of clusters in a region of the (M, z) plane 



Before moving on to the core of the paper I will briefly outline the methodology used to calculate the expected
number of clusters in a given region of mass and redshift over a given fraction of the sky, $f_{\rm sky}$. For each measure of
rareness I will be calculating the probabilities of clusters existing assuming a Poissonian distribution; therefore only
the expected number is needed. This quantity is given by the following integral

$$\langle N \rangle = f_{\rm sky} \int dz \, \frac{dV}{dz} \int dM \, \frac{dn}{dM}, \qquad (1)$$

where the $z$ and $M$ integrals are over the region of the (M, z) plane being considered, $dn/dM$ is the number density
of galaxy clusters (the halo mass function) and $dV/dz$ is the volume element.
It is customary to reparameterise $dn/dM$ as

$$\frac{dn}{dM} = f(\sigma) \, \frac{\rho_m}{M} \, \frac{d \ln \sigma^{-1}}{dM}, \qquad (3)$$

where $\rho_m$ is the present density of matter in the universe and $\sigma$ is the variance of the density contrast smoothed
over the comoving scale, $R$, that corresponds to the mass, $M$. The function $f$ (often also called the mass function,
a convention I will follow in this work) must then either be calculated theoretically or by matching to simulations.
The form of $f$ depends on the shape of the primordial spectrum. For most of this work I will be assuming that the
primordial spectrum is Gaussian and will use the Tinker et al. mass function [?],



$$f(\sigma) = A \left[ \left( \frac{\sigma}{b} \right)^{-a} + 1 \right] e^{-c/\sigma^2}, \qquad (4)$$

with $A = 0.186(1+z)^{-0.14}$, $a = 1.47(1+z)^{-0.06}$, $b = 2.57(1+z)^{-0.011}$ and $c = 1.19$. I will calculate $\sigma(M, 0)$ using
the fit to the transfer function presented in [?] with the modified shape parameter introduced in [16]. The redshift
dependence of $\sigma$ is then obtained by multiplying $\sigma(M, 0)$ by the linear growth function normalised to be proportional
to $1/(1+z)$ during the era of matter domination. Unless otherwise stated I will also use the WMAP 7 Cosmology
maximum likelihood (ML) parameters [17]. Using a different set of cosmological parameters (specifically $\sigma_8$) would
change the exact numbers quoted in parts of this paper but would not alter any of the conclusions drawn from these
numbers.
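
As an illustration of the mass function used here, the following Python sketch evaluates the Tinker et al. fitting form, $f(\sigma) = A[(\sigma/b)^{-a} + 1] e^{-c/\sigma^2}$, with the redshift scalings quoted above. The function name `tinker_f` is hypothetical, the exponents -0.14, -0.06 and -0.011 are assumed to be the published Tinker et al. values, and the calculation of $\sigma(M, z)$ itself (transfer function and growth function) is not reproduced.

```python
import math


def tinker_f(sigma, z=0.0):
    """Sketch of the Tinker et al. multiplicity function f(sigma) at redshift z.

    Assumption: redshift scalings A ~ (1+z)^-0.14, a ~ (1+z)^-0.06 and
    b ~ (1+z)^-0.011, as quoted in the text; sigma must be supplied by the caller.
    """
    A = 0.186 * (1.0 + z) ** -0.14
    a = 1.47 * (1.0 + z) ** -0.06
    b = 2.57 * (1.0 + z) ** -0.011
    c = 1.19
    return A * ((sigma / b) ** -a + 1.0) * math.exp(-c / sigma ** 2)


# Smaller sigma (larger mass) is exponentially suppressed.
print(tinker_f(1.0), tinker_f(0.5))
```

The exponential cut-off at small $\sigma$ is what makes the expected number of clusters so sensitive to mass, to redshift and to $\sigma_8$.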



B. Notation used to classify different rareness measures 



FIG. 1: An outline of the reason for the bias in $\tilde{R}_{>M>z}$. Every point on the curve has the same number of clusters expected
at greater masses and redshifts (i.e. the same $\tilde{R}_{>M>z} = R^*$). The probability that a cluster will exist with $\tilde{R}_{>M>z} < R^*$ is
equal to the probability that a cluster will exist above this entire curve. This probability is necessarily greater than $R^*$.

In this work I will be introducing a number of different measures to quantify cluster rareness. In each case, the
quoted measure is to be interpreted as the probability that at least one cluster (or set of clusters) could exist that
is at least as rare as the measured cluster (or set). For reference, I will quickly summarise the notation convention I
have used. For every measure I use $R$ as the base for an unbiased measure and $\tilde{R}$ for a biased measure. For individual
cluster rareness, each measure corresponds to a prescription for defining unique contours in the (M, z) plane. I have
chosen to depict the contour used for each measure with subscript text. For example, for the biased version of the
$>M>z$ measure of rareness I write $\tilde{R}_{>M>z}$. I also introduce a number of ways to combine the individual rarenesses
of a set of clusters. I have chosen to distinguish each method with superscript text. For example, for the method
used by Hoyle et al. in Ref.[l3| I write Rf . To depict the number of clusters used in a combined measure I put 
this number in subscript immediately before the contour definition. In principle, for a set of clusters, it is possible to 
combine any contour definition with any method for combining the individual cluster rarenesses. 

II. $\tilde{R}_{>M>z}$ AS A MEASURE OF RARENESS

1. Why there is a bias 

To quantify the rareness of a cluster using the $>M>z$ measure it is necessary to first calculate the number of
clusters expected above the mass and redshift of the observed cluster. This expected number is then converted into
the probability that at least one cluster at least as rare as the observed cluster could exist, $\tilde{R}_{>M>z}$, by assuming
Poissonian statistics. If this probability is small, the cluster is deemed to be rare and tension can be claimed with
ΛCDM.
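
The conversion from an expected number to a probability can be sketched in a few lines of Python. This is only an illustration of the Poissonian assumption stated above; the function names are hypothetical.

```python
import math


def prob_at_least_one(expected_number):
    """P(N >= 1) = 1 - exp(-<N>) for a Poisson-distributed count."""
    return -math.expm1(-expected_number)


def expected_number_for(probability):
    """Inverse relation: the Poisson mean that gives the requested P(N >= 1)."""
    return -math.log1p(-probability)


# A region in which ln(2) clusters are expected is occupied half of the time.
print(prob_at_least_one(math.log(2.0)))
```

For small expected numbers the probability and the expected number are nearly equal, which is the regime relevant to claims of tension.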

The bias in this measure can be seen in the following way. Using the definition given it is possible to define a
value of $\tilde{R}_{>M>z}$ at every point in the (M, z) plane. It is then possible to construct contours of equal $\tilde{R}_{>M>z}$ in the
plane. The number of clusters expected above each contour of constant $\tilde{R}_{>M>z}$ is necessarily larger than the expected
number of clusters used to calculate $\tilde{R}_{>M>z}$. This is because, for every point in the plane, $\tilde{R}_{>M>z}$ is calculated
using only the expected number of clusters at greater masses and greater redshifts. Nevertheless, there will be points
at lower masses and greater redshifts (and vice versa) that lie on the same $\tilde{R}_{>M>z}$ contour. These points would not
be included in the integration region used to define $\tilde{R}_{>M>z}$ at any other point on the contour, but are defined to
be equally rare. Therefore, for a given $\tilde{R}_{>M>z} = R^*$, the probability that some cluster exists with $\tilde{R}_{>M>z} < R^*$ is
necessarily greater than $R^*$ itself. This is the bias. This argument is illustrated in Figure 1.
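
The size of this effect can be illustrated with a toy model in Python. The sketch below is not the ΛCDM calculation: it assumes, purely for illustration, that the expected number of clusters above rescaled mass and redshift coordinates $(u, v)$, with $u, v \geq 0$, is `n_total * exp(-u - v)`. Contours of constant $>M>z$ rareness are then the lines $u + v = t$, and integrating the toy density over $u + v > t$ gives an expected number larger by a factor $(1 + t)$. All names are hypothetical.

```python
import math


def true_probability_above_contour(r_star, n_total=100.0):
    """Toy model: probability that some cluster lies above the whole contour
    on which the point-wise (>M>z) probability equals r_star.

    Assumption: expected number above (u, v) is n_total * exp(-u - v), u, v >= 0.
    """
    n_above_point = -math.log1p(-r_star)        # Poisson mean behind r_star
    t = math.log(n_total / n_above_point)       # contour u + v = t
    n_above_contour = n_above_point * (1.0 + t)  # density integrated over u + v > t
    return -math.expm1(-n_above_contour)


# A nominal 10% probability corresponds to a much larger true probability.
print(true_probability_above_contour(0.1))
```

In this toy model the discrepancy grows as `n_total` increases, because the contour becomes longer while the point-wise integration region stays the same.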



2. The magnitude of the bias 

In Refs. [1, ?] the idea of a contour of constant $\tilde{R}_{>M>z}$ is used explicitly. Ref. [1] provides fitting formulae for the
contours as a function of the exclusion limit required to rule out ΛCDM. But, for a given exclusion limit of $\alpha$, if any
cluster detected above the corresponding contour is to be interpreted as ruling out ΛCDM with $100\alpha$% confidence,
then the probability of observing such a cluster cannot be greater than $1 - \alpha$. In Figure 2 I have tested this. Ref. [1]
makes a distinction between $p$, the proportion of WMAP 7 parameter space ruled out, and $s$, the probability to have
zero clusters $>M>z$ in a random sample of the same sky fraction. I have always taken $s = p$ and have used this
as the x-axis of Figure 2. The y-axis gives the probability of at least one cluster existing in a specified region of
the (M, z) plane, using a specified set of cosmological parameters. The three different lines correspond to particular
regions and particular cosmological parameters. For each line I have taken $f_{\rm sky} = 1$.

FIG. 2: The solid red line is $\tilde{R}_{>M>z}$, the probability of at least one cluster existing at a greater mass and redshift than each
point on a contour, $M(z)$. The dashed and dotted black lines are the probability of at least one cluster existing above the
whole $M(z)$ contour. For each data point in the figure, $M(z)$ is defined so that the value of $1 - \tilde{R}_{>M>z}$ on $M(z)$ is equal to
the value on the x-axis. To produce this figure I have used the fitting formula for $M(z)$ provided in Ref. [1]. See the main text
for a discussion of which cosmological parameters were used for each line.

The solid red line uses the region of (M, z) defined as $>M>z$ of any arbitrary point on the corresponding contour
(by definition all points on the same contour return the same result). Ref. [1] states that, for a contour with a given $p$,
in $100p$% of allowed cosmological parameter space the expected number of clusters is less than the expected number
that would give the required statistical exclusion limit, $s$. The expected number of clusters depends sensitively and
monotonically on $\sigma_8$ but very weakly on all other cosmological parameters. Therefore, for the solid red line I calculate
the expected number of clusters $>M>z$ using $\sigma_8(p)$ defined by

$$P(\sigma_8 < \sigma_8(p)) = p. \qquad (5)$$

Following Ref. [1], the distribution for $\sigma_8$ was taken from combined CMB, SN, BAO and $H_0$ constraints. The red line
matches well to expectations, indicating that I am correctly interpreting Ref. [1]. The dashed and dotted black lines
correspond to the probability of observing at least one cluster above the whole $M(z)$ contour. The dashed line comes
from using the WMAP maximum likelihood cosmological values and the dotted line comes from using $\sigma_8$ defined by
eq. (5). From the dashed line it is seen that in a true ΛCDM universe there is a 40% chance of a cluster existing
with a value of $\tilde{R}_{>M>z}$ that would claim to exclude ΛCDM at 90% confidence. The dotted line makes it possible
to estimate the true conservative significance that should be claimed by the observation of a cluster on or above a
particular $M(z)$ contour. When the dotted line gives a probability of 0.1, the x-axis value is > 0.99. Therefore, if a
cluster were detected that should result in a claim of exclusion at only ~90% significance, the $\tilde{R}_{>M>z}$ statistic would
erroneously claim this as > 99% significance.



A. The bias in practice 



In this section I will examine this bias using sample cluster populations, generated assuming a ΛCDM universe.
In observations the $>M>z$ contours were not set until the clusters were observed. Therefore, to simulate the
observations correctly, it is crucial that the contours are set after the sample is generated, at the masses and redshifts
of the sample clusters and not at the masses and redshifts of the clusters observed in our universe.



1. Individual clusters 

In this subsection I will calculate $\tilde{R}_{>M>z}$ for the most extreme clusters of sample populations. To generate the
samples I have first broken the (M, z) plane into bins of equal spacing in $\Delta z$ and $\Delta(\ln M)$. I have then calculated the
expected number of clusters to exist in each bin, assuming ΛCDM and using the methodology of Section I A. I then
generate a Poisson sample in every bin. The occupied bins determine the masses and redshifts of a sample population
of clusters. The bin spacing is ensured to be fine enough such that no bin is ever occupied by more than one cluster.

FIG. 3: Histograms showing the distributions generated for an unbiased rareness measure ($R_{<f}$, left panel) and a biased measure
($\tilde{R}_{>M>z}$, right panel) when they are applied to sample cluster populations. The x-axis on the right panel is the probability that a
cluster can exist at greater mass and redshift than the rarest cluster in each sample.

Cluster set         Mean            Median
H10                 6.1 X 10"^      2.6 X 10~®
Sample (rarest)     3.0 X 10'*      1.5 X 10"*^
Sample (random)     0.70            0.84

TABLE I: Mean and median of the measure of rareness used in H10 (i.e. Ref. [10]), applied to the table of clusters in H10 and
to sample ΛCDM cluster populations. This measure was used in [10, 11] to claim significant tension with ΛCDM.

For every bin in the (M, z) plane I also calculate $\tilde{R}_{>M>z}$. In each sample, I then order the occupied bins with respect
to $\tilde{R}_{>M>z}$. The right hand panel of Fig. 3 is a histogram showing the proportion of times that each $\tilde{R}_{>M>z}$ value
was the smallest $\tilde{R}_{>M>z}$ value in all of the occupied bins^1. If $\tilde{R}_{>M>z}$ was an unbiased statistic this would be
uniform^2; instead, smaller values of $\tilde{R}_{>M>z}$ are disproportionately favoured. The histogram in the left hand panel is
an identically produced histogram for one of the unbiased statistics that will be introduced in Section III.
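
The sampling test described above can be imitated with a short Monte Carlo in Python. This sketch does not use the ΛCDM mass function: it assumes a toy model in which each cluster has rescaled coordinates $u, v \geq 0$ drawn from unit exponentials, so that the expected number above $(u, v)$ is `n_total * exp(-u - v)` and the expected number above the whole contour through that point is larger by a factor $(1 + u + v)$. It also samples the continuum limit directly instead of using bins. All names are hypothetical.

```python
import math
import random


def poisson_draw(mean, rng):
    """Poisson variate: count unit-rate arrivals that occur before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def rarest_cluster_stats(n_samples=4000, n_total=100.0, seed=1):
    """Return the biased and unbiased rareness of the rarest cluster in each sample."""
    rng = random.Random(seed)
    biased, unbiased = [], []
    for _ in range(n_samples):
        n_clusters = poisson_draw(n_total, rng)
        if n_clusters == 0:
            continue  # empty sample (vanishingly unlikely for n_total = 100)
        # In the toy model the rarest cluster is the one with the largest u + v.
        s_max = max(rng.expovariate(1.0) + rng.expovariate(1.0)
                    for _ in range(n_clusters))
        n_point = n_total * math.exp(-s_max)   # expected number at >u and >v
        n_contour = n_point * (1.0 + s_max)    # expected number above the contour
        biased.append(-math.expm1(-n_point))
        unbiased.append(-math.expm1(-n_contour))
    return biased, unbiased


biased, unbiased = rarest_cluster_stats()
frac_biased = sum(r < 0.1 for r in biased) / len(biased)
frac_unbiased = sum(r < 0.1 for r in unbiased) / len(unbiased)
print(frac_biased, frac_unbiased)
```

For an unbiased statistic the fraction of samples whose rarest cluster has a value below 0.1 should be close to 0.1. In this toy model the analytic expectation for the biased statistic is roughly 0.56, which is the same qualitative behaviour as the right hand panel of Fig. 3.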

2. Multiple clusters 

In Refs. [10, 11] the individual $\tilde{R}_{>M>z}$ values for a set of clusters were combined to form an estimate of
the rareness of the set as a whole. This was done simply by multiplying together the $\tilde{R}_{>M>z}$ values for each of the
individual clusters. These references analysed a particular table of clusters put together in Ref. [10] (hereafter referred
to as H10). The table consisted, at the time, of all the clusters that had been detected, with spectroscopic follow up,
above $z = 1$. The masses, redshifts and mass errors for the table of clusters are quoted in Table 1 of both references.
Some clusters in this table were detected by X-ray experiments and some from SZ experiments. The convention used
in the references was to take $f_{\rm sky}$ = 283 sq. degrees for the X-ray detected clusters and $f_{\rm sky}$ = 178 sq. degrees for the
SZ detected clusters. To compute the combined statistic it is necessary to account for mass uncertainties. Both references
assumed that the masses of the clusters in the table were log-normally distributed. They then sampled from the mass
distribution for each cluster and, for each sample, calculated the corresponding combined statistic. The final value for
the H10 clusters is the mean. The result is very small, ~ 10"'^, and resulted in claims of large tension.
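
The procedure of sampling the masses and averaging the product can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of the cited references: the per-cluster rareness is supplied by the caller as a function, the quoted mass is taken as the median of the log-normal distribution, and the scatter is expressed in $\ln M$. The names `mean_product_rareness` and `toy_rareness` are hypothetical, and `toy_rareness` is a made-up placeholder with no physical content.

```python
import math
import random


def mean_product_rareness(clusters, rareness, n_draws=1000, seed=0):
    """Average, over log-normal mass draws, of the product of individual rarenesses.

    clusters: list of (mass, sigma_ln_mass, redshift) tuples.
    rareness: callable (mass, redshift) -> probability in [0, 1].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        product = 1.0
        for mass, sigma_ln_mass, z in clusters:
            drawn_mass = rng.lognormvariate(math.log(mass), sigma_ln_mass)
            product *= rareness(drawn_mass, z)
        total += product
    return total / n_draws


def toy_rareness(mass, z):
    """Placeholder only: a smooth made-up function, NOT a LCDM calculation."""
    return math.exp(-(mass / 1e15) * (1.0 + z))


example = [(5e14, 0.3, 1.1), (7e14, 0.4, 1.4)]
print(mean_product_rareness(example, toy_rareness))
```

Because the product is averaged over the mass draws, large mass uncertainties broaden the distribution of the combined value, which is why the distribution for H10 has to be compared with sample populations and not read in isolation.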



^1 For this figure I took $f_{\rm sky}$ = 2500 sq. degrees and imposed $M > 10^{15} M_\odot$ (see Section III A 4).

^2 $R$ is supposed to quantify the probability that any cluster could exist that is at least as rare as a given cluster. Put another way, for an
unbiased measure, if $R^*$ is the rareness of the rarest cluster then $P(R^* < R) = R$. This is the definition of a uniform distribution.
FIG. 4: Histograms of the measure of rareness introduced in H10 applied to the table of clusters in H10 and to sample
ΛCDM cluster populations. The distribution for H10 comes from sampling the mass uncertainties (see text).



I have repeated this calculation in Table I with two minor changes. Firstly, I have removed the lightest cluster from
H10. This has no effect on the result. Secondly, I have used $f_{\rm sky}$ = 178 + 283 = 461 sq. degrees for all clusters. This
does affect the quoted numbers but not the conclusions that can be drawn from them. This choice was made because
it makes it simpler to compare to sample populations^3, which I have also done. To generate the samples I made a cut
of $z > 1$ and $M > 2 \times 10^{14} M_\odot$ to attempt to imitate the nature in which H10 was constructed. Including clusters in
the sample below either of these thresholds would increase the effects of the bias by increasing the number of clusters
with small $\tilde{R}_{>M>z}$. In Table I the sample (rarest) entry corresponds to the combined value calculated using the 13
occupied bins with the smallest $\tilde{R}_{>M>z}$. The sample (random) entry corresponds to the combined value calculated using 13
random occupied bins. It is clear that the mere existence of the H10 clusters does not provide tension with ΛCDM.
In Fig. 4 I have plotted histograms for H10 and the sample populations. For the samples, the distribution plotted
is the proportion of times that a sample had the binned combined value using the 13 rarest clusters. For H10, the
distribution arises through the log-normal sampling of the mass errors. The distributions are very consistent with
each other. In fact, an actual ΛCDM universe would be consistent, to ~95% confidence, with combined values as
small as 10~^°.

Due to the conservative methods used to construct H10^4 it is unlikely that these clusters are the 13 rarest clusters
in this 461 sq. degree window. If these clusters were picked at random from the sky they would be anomalously
rare. However, this also does not reflect reality. Larger mass clusters are more likely to be seen and the rarest
clusters are more interesting and thus more likely to have been spectroscopically followed up. If the full H10 selection
function were properly taken into account, the true result for the samples would therefore lie somewhere between the
two presented in Table I. This is exactly where the observed value lies.

In Refs. [10, 11] the apparent tension with ΛCDM was used to claim detections of minimum values for the
non-Gaussianity parameters $f_{\rm NL}$ and $g_{\rm NL}$. Clearly, without taking selection functions into account, these clusters present
no evidence for non-Gaussianity and an unbiased analysis would return results consistent with $f_{\rm NL} = g_{\rm NL} = 0$.



III. UNBIASED MEASURES OF RARENESS 



It is easy to test whether any measure of rareness is unbiased by applying it to sample cluster populations. If it
returns a uniform distribution, it is unbiased. All of the unbiased measures in this section have been tested in this
way. An example of this is shown in the left panel of Figure 3. This was calculated for $R_{<f}$; however, any unbiased
measure of rareness would give an identical result.
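
One way to automate such a uniformity check is with a Kolmogorov-Smirnov distance, sketched below in Python. This is a swapped-in, generic statistical test and not necessarily how the histograms in this paper were assessed; the function names are hypothetical and the threshold $1.36/\sqrt{n}$ is the standard large-sample critical value at roughly 5% significance.

```python
import math


def ks_uniform_statistic(values):
    """Kolmogorov-Smirnov distance between a sample and the uniform distribution on [0, 1]."""
    xs = sorted(values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def looks_uniform(values, coefficient=1.36):
    """Rough check against the large-n critical value coefficient / sqrt(n)."""
    return ks_uniform_statistic(values) < coefficient / math.sqrt(len(values))


uniform_like = [(i + 0.5) / 100 for i in range(100)]
skewed = [x ** 2 for x in uniform_like]  # small values over-represented
print(looks_uniform(uniform_like), looks_uniform(skewed))  # prints: True False
```

Applied to the rareness of the rarest cluster in many sample populations, a statistic that favours small values, as the biased measure does, fails this check.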



^3 It is also arguably the more correct choice. However, treating the SZ and X-ray $f_{\rm sky}$ values separately is not clearly wrong because not
all of the X-ray detected clusters would have been detected by an SZ experiment.
^4 See Ref. [10] for full details, but an example is the already mentioned exclusion of all observed clusters without spectroscopic follow up.
A. Single cluster measures

1. $R_{>M>z}$

The bias in $\tilde{R}_{>M>z}$ can be removed. This is done simply by finding, for each observed value of $\tilde{R}_{>M>z}$, the true
probability of observing this value of $\tilde{R}_{>M>z}$ or less. This amounts to calculating the expected number of clusters
above a contour, $M(z)$, of constant $\tilde{R}_{>M>z}$ and then using this expectation value to calculate $R_{>M>z}$, the probability
of at least one cluster at least this rare existing.

In Ref. [3] a cluster was detected that was claimed, using $\tilde{R}_{>M>z}$, to have only a 7% chance of existing in ΛCDM.
Using the mass and cosmology quoted in that reference I find $\tilde{R}_{>M>z} = 0.06$, in good agreement with their result. With a
mass cut of $M > 10^{15} M_\odot$^5 the unbiased value for this cluster is $R_{>M>z} = 0.5$. Although this cluster certainly is not
a common cluster, its existence does not provide any tension with ΛCDM.

2. $R_{<f}$

Despite the fact that $R_{>M>z}$ can be used as an unbiased measure of the rareness of extreme galaxy clusters, there
are other possible unbiased measures that will be more justified in certain circumstances. The calculation of the
expected number of clusters in a region of the (M, z) plane depends on the mass function, $f$, and the volume element.
If it is assumed that the expansion history of the universe is exactly ΛCDM but that the primordial perturbations
are different then $f$ will change, but the volume element will be unchanged. For this situation, it is useful to define
the contours, $M(z)$, of constant rareness as contours of constant $f$. The rareness, $R_{<f}$, is then defined to be the
probability that at least one cluster lies above the contour of constant $f$.

This definition would remove the possibly unwanted effect of a cluster being called 'rare' simply because it was
found in a region of the (M, z) plane that had a small volume. A region of the (M, z) plane with small $f$ will be seen as
rarer by $R_{<f}$ than by $R_{>M>z}$ because regions with larger $f$ but smaller volume will not be included in the expectation
value used to calculate $R_{<f}$. If, however, the volume element was different to ΛCDM predictions, $R_{<f}$ would be less
sensitive to this than $R_{>M>z}$.

3. $R_{>z}$

Alternatively, it could be assumed that the primordial perturbations are well described by a Gaussian power law
but there are deviations from the ΛCDM volume element or growth function. In this case, the deviations from ΛCDM
will depend only on redshift and not on mass at all. For this possibility, it is useful to introduce a measure of rareness
that quantifies rareness only by how large $z$ is. Clusters found at large redshift and small mass would be classified as
rarer by this measure than they would by other measures.

4. Measures applied to existing extreme clusters

I have applied these rareness measures to the clusters in H10 and Ref. [?] (hereafter W11). For the H10 clusters
I have used the same redshift cut, mass cut and $f_{\rm sky}$ as in Section II A 2. If lower masses were allowed, then every
cluster would become less rare because more space would be opened up in the (M, z) plane to find clusters. Even with
this cut in mass, none of the clusters are exceptionally rare. For the W11 clusters there is no cut in redshift, but a cut
of $M > 10^{15} M_\odot$ and $f_{\rm sky}$ = 2500 sq. degrees. This cut is used to approximate the selection function of W11. W11
saw 26 clusters. With this mass cut, the expected number of clusters observed is > 100. In a more accurate analysis,
taking the selection function properly into account, the W11 clusters would be rarer than what I quote here. However,
the same calculation with a higher mass cut, designed to produce only 26 expected clusters, gives qualitatively the
same results; that is, none of the clusters are exceptionally rare. I have presented results for $M > 10^{15} M_\odot$ because
this is the lowest mass of any of the clusters observed in W11.

Table II lists $R_{<f}$ and $R_{>M>z}$ for the most extreme H10 and W11 clusters. Uncertainties in the masses of the
clusters have been accounted for with the method described in Section II A 2^6. In this table I have also presented the
"equivalent mass at redshift zero" of each cluster. As seen in the previous section, an unbiased measure of rareness
is constructed from a set of contours in the (M, z) plane defined to be equally rare. The "equivalent mass at redshift
zero" is then the point at which the $M(z)$ contour for a given cluster and measure crosses redshift zero. That is, another
cluster observed at redshift zero, with mass $M(0)$, would be judged to be equally rare. For these values I have chosen
not to marginalise over the mass uncertainties, using only the central value of mass quoted in H10 and W11. The
actual equivalent mass at redshift zero depends on the rareness measure. The more intuitively sensible values are
those for the $R_{>M>z}$ measure. The $R_{<f}$ measure, while useful when looking for new physics, loses something here
because it neglects the volume element, which is small at $z = 0$. While some of these clusters would be very massive
at $z = 0$, they would not be beyond precedent^7.

Cluster               $R_{<f}$   $R_{>M>z}$   $<f$ mass at $z = 0$             $>M>z$ mass at $z = 0$
J2235.3+2557 (H10)    0.58       0.49         $7.7 \times 10^{15} M_\odot$     $3.3 \times 10^{15} M_\odot$
J0546-5345 (H10)      0.76       0.61         $6.2 \times 10^{15} M_\odot$     $2.8 \times 10^{15} M_\odot$
J0910+5422 (H10)      0.86       0.79         $4.5 \times 10^{15} M_\odot$     $1.8 \times 10^{15} M_\odot$
J2215.9-1738 (H10)    0.85       0.81         $5.2 \times 10^{15} M_\odot$     $1.8 \times 10^{15} M_\odot$
J0102-4915 (W11)      0.63       0.61         $7.1 \times 10^{15} M_\odot$     $3.8 \times 10^{15} M_\odot$
J0615-5746 (W11)      0.63       0.70         $7.1 \times 10^{15} M_\odot$     $3.5 \times 10^{15} M_\odot$
J0658-5556 (W11)      0.84       0.63         $5.2 \times 10^{15} M_\odot$     $3.6 \times 10^{15} M_\odot$
J2106-5844 (W11)      0.73       0.86         $6.7 \times 10^{15} M_\odot$     $3.0 \times 10^{15} M_\odot$
J2248-4431 (W11)      0.84       0.66         $5.3 \times 10^{15} M_\odot$     $3.5 \times 10^{15} M_\odot$
J2344-4243 (W11)      0.92       0.88         $5.0 \times 10^{15} M_\odot$     $2.7 \times 10^{15} M_\odot$

TABLE II: Two unbiased rareness measures applied to the most extreme clusters in H10 and W11, together with each cluster's
'equivalent mass at redshift zero' according to both rareness measures. This is the mass of a cluster at redshift zero that
would be judged to be equally as rare as the actual observed cluster. The value of $R$ should be interpreted as the probability
that at least one cluster could exist that is at least as rare as the observed cluster.

^5 This cluster is also in W11 (see Section III A 4). The mass quoted in Ref. [?] is larger than in W11 for reasons explained in Ref. [?].
^6 For the W11 clusters I added the experimental and systematic errors in quadrature.

Although no single cluster in Table II is extremely rare, it might be thought that the total probability of having
this many clusters with rareness < 0.7 would be very small. However, caution must be taken when estimating the
rareness of a set of clusters by looking at the rareness of its constituents, especially when mass uncertainties are large.
For any cluster, large mass uncertainties will push $R$ towards 0.5 because the uncertainties will make the cluster more
consistent with being both very rare ($R \sim 0$) and very generic ($R \sim 1$). When considering a set of clusters, the mass
uncertainties need to be taken into account all at once. If the uncertainties are large then the same effect will drive $R$
towards 0.5 for the set as well. Thus, two individual $R$'s close to 0.5 should not necessarily be interpreted as meaning
the two clusters are rare as an ensemble. It is possible that they are both just not well measured. Nevertheless, two
clusters with very small mass uncertainties and rareness measures of ~0.5 would be seen to be much rarer as an
ensemble. The following section will answer this question decisively for H10 and W11. Neither set is particularly rare.

B. Multiple cluster measures 

Studying the rareness of individual galaxy clusters is useful and may provide information about new physics.
However, it may be the case that any deviations from ΛCDM are subtle. Thus, no individual galaxy cluster might be
rare enough on its own to rule out ΛCDM to a high degree of significance. Perhaps, if there are a number of clusters,
which individually provide only small tension with ΛCDM, they might collectively provide much more tension. To
quantify this it is necessary to have a measure of the rareness of a set of clusters.

Quantifying the rareness of a set of clusters risks potential biases similar to the individual rareness biases. Consider
the situation where only two clusters are being examined, clusters A and B. An intuitive rareness measure is
constructed by asking "what is the probability of at least one cluster existing that is at least as rare as A and at least
one cluster existing that is at least as rare as B?". This measure would be biased. If A was made less rare but B was
made more rare to compensate then the measure can remain the same. Therefore, the true probability of a pair of clusters
existing that are as rare as A and B is necessarily greater than the value this measure returns. With sets of clusters care
must also be taken because the rareness of one cluster is not independent of the rareness of other clusters in the set.
Nevertheless, these obstacles can be overcome. I present below three possible methods.

^7 Care should be taken when comparing the masses for H10 and for W11. Because of the different cuts in mass and redshift, equal values
of $f$ or $\tilde{R}_{>M>z}$, for clusters from different sets, would not equate to equal values of $R_{<f}$ and $R_{>M>z}$.

1. 

The product measure was introduced in Section II A 2, where it was biased because I used it with $R_{>M>z}$. It would also 
be biased if instead I used an unbiased individual measure. This second bias results from the method of multiplying the 
individual cluster rarenesses together. Distributions that are uniform between zero and one will not give a uniform 
distribution when multiplied together. The benefit of this measure is that it is easy to calculate. Any result can be 
compared to sample cluster populations to check expectations and thus remove the bias. 
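
The statement that a product of uniform variables is not itself uniform can be checked with a minimal Python sketch. It is an illustration only, not part of the original analysis: the function names are hypothetical, and the two rareness values are idealised as independent draws from a uniform distribution. Under that assumption the product satisfies $P(u_1 u_2 \le r) = r(1 - \ln r)$, which exceeds $r$ for every $0 < r < 1$.

```python
import math
import random


def product_cdf(r):
    """Analytic P(u1 * u2 <= r) for independent u1, u2 ~ Uniform(0, 1)."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return 1.0
    return r * (1.0 - math.log(r))


def empirical_product_cdf(r, n_samples=50_000, seed=1):
    """Monte Carlo estimate of the same probability (seeded for repeatability)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.random() * rng.random() <= r)
    return hits / n_samples


if __name__ == "__main__":
    # A uniform variable would give P = r; the product gives a larger value.
    for r in (0.05, 0.25, 0.5):
        print(r, product_cdf(r), empirical_product_cdf(r))
```

Under these assumptions a product as small as 0.05 is expected roughly 20% of the time, so reading the raw product as a probability would overstate the rareness of the pair.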

2. $R^{\rm th}$ 

Another simple, multiple cluster measure of rareness is suggested in Ref. [4]. This measure asks what is the 
probability that there are at least $i$ clusters at least as rare as the $i^{\rm th}$ rarest cluster in the set. This gives 

$$R^{\rm th}_i = \sum_{n=i}^{\infty} \frac{\lambda_i^{n}\, e^{-\lambda_i}}{n!},$$

where $\lambda_i$ is the expected number of clusters above the contour of constant rareness for the $i^{\rm th}$ rarest 
cluster. $R^{\rm th}_i$ is unbiased because it depends on only the one parameter, $\lambda_i$. 
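
As a small worked example, the probability of finding at least $i$ clusters above a contour is a Poisson tail probability. The sketch below is illustrative: the function name `rareness_ith` and the argument `lam` (the expected number of clusters above the contour) are my own labels, and Poisson statistics for the cluster counts are assumed.

```python
import math


def rareness_ith(i, lam):
    """Return P(N >= i) for N ~ Poisson(lam), with i >= 1.

    `lam` is the expected number of clusters above the contour of the
    i-th rarest cluster (an assumed input, not computed here).
    """
    if i < 1:
        raise ValueError("i must be a positive integer")
    term = math.exp(-lam)  # Poisson probability of n = 0
    below = 0.0            # accumulates P(N <= i - 1)
    for n in range(i):
        below += term
        term *= lam / (n + 1)  # step to the n + 1 term
    return 1.0 - below
```

For $i = 1$ this reduces to $1 - e^{-\lambda_1}$, the probability that at least one cluster lies above the contour of the rarest cluster.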

3. $R^{C}$ 

It could occur that all the $R^{\rm th}_i$ are small but not small enough to claim a significant deviation from ΛCDM. 
Alternatively, one of the $R^{\rm th}_i$ could be very small, but all the rest > 0.1. What is needed is a way to quantify the 
combined significance of all of the $R^{\rm th}_i$. The product measure will do this to a certain degree, but crudely. Most of 
the effect on the product measure comes from the one or two most extreme clusters whereas $R^{\rm th}_i$ could be significant 
at large $i$. 

The easiest suggestion is to multiply each of the $R^{\rm th}_i$ together. This is fine and would work (it is biased, but the bias 
is easily accounted for). However, the individual $R^{\rm th}_i$ measures are not independent of each other. If $R^{\rm th}_1$ is known 
to be very small then this enhances the probability that $R^{\rm th}_2$ will also be small. If the two individual measures are 
naively multiplied together then two sets of clusters that are not equally rare will be classified so. This weakens the 
sensitivity of the measure. 

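The solution to this is to use a measure defined as the probability of the intersection of the rareness measures, 

$$\hat{R}^{C}_i = P\left(R^{\rm th}_1 < R^{\rm th,obs}_1 \,\cap\, R^{\rm th}_2 < R^{\rm th,obs}_2 \,\cap\, \cdots \,\cap\, R^{\rm th}_i < R^{\rm th,obs}_i\right), \qquad (7)$$

where $R^{\rm th,obs}_i$ is the observed value of $R^{\rm th}_i$. By definition, $\hat{R}^{C}_1 = R^{\rm th}_1$. The $i = 2$ calculation proceeds as follows. First, 
define $N_i$ to be the number of clusters, in a sample universe, above the contour of constant rareness associated to the 
observed $i^{\rm th}$ rarest cluster. Define $N_{ij}$ to be the number of clusters between the $i^{\rm th}$ and $j^{\rm th}$ contours. Then 

$$\begin{aligned}
\hat{R}^{C}_2 &= P(N_2 \ge 2 \,\cap\, N_1 \ge 1) \\
&= P(N_{12} \ge 0) \times P(N_1 \ge 2) + P(N_{12} \ge 1) \times P(N_1 = 1) \\
&= P(N_1 \ge 1) - P(N_{12} = 0) \times P(N_1 = 1) \\
&= R^{\rm th}_1 - \lambda_1 e^{-\lambda_2},
\end{aligned} \qquad (8)$$

which is true because $N_{12} = N_2 - N_1$ and $P(N \ge n) = 1 - P(N < n)$. Similar calculations for $\hat{R}^{C}_3$ and $\hat{R}^{C}_4$ return 

$$\begin{aligned}
\hat{R}^{C}_1 &= 1 - e^{-\lambda_1}, \\
\hat{R}^{C}_2 &= \hat{R}^{C}_1 - \lambda_1 e^{-\lambda_2}, \\
\hat{R}^{C}_3 &= \hat{R}^{C}_2 - e^{-\lambda_3}\left(\lambda_1 \lambda_2 - \frac{\lambda_1^2}{2}\right), \\
\hat{R}^{C}_4 &= \hat{R}^{C}_3 - e^{-\lambda_4}\left(\frac{\lambda_1^3}{6} + \lambda_1 \lambda_2 \lambda_3 - \frac{\lambda_1 \lambda_2^2}{2} - \frac{\lambda_3 \lambda_1^2}{2}\right).
\end{aligned} \qquad (9)$$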
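
The closed-form expressions for the first four combined measures can be cross-checked numerically. The Python sketch below is a hedged illustration rather than the original calculation: it assumes independent Poisson counts in the regions between successive contours, and the function names are hypothetical. It compares the recursion $\hat{R}_1 = 1 - e^{-\lambda_1}$, $\hat{R}_2 = \hat{R}_1 - \lambda_1 e^{-\lambda_2}$, $\hat{R}_3 = \hat{R}_2 - e^{-\lambda_3}(\lambda_1\lambda_2 - \lambda_1^2/2)$, $\hat{R}_4 = \hat{R}_3 - e^{-\lambda_4}(\lambda_1^3/6 + \lambda_1\lambda_2\lambda_3 - \lambda_1\lambda_2^2/2 - \lambda_3\lambda_1^2/2)$ with a brute-force sum over all cluster counts satisfying $N_j \ge j$ for every $j \le i$.

```python
import math
from itertools import product


def biased_combined_closed_form(lams):
    """Closed-form biased combined rareness for the 1 to 4 rarest clusters.

    `lams` holds the expected numbers above successive contours, in
    increasing order (lambda_1 <= lambda_2 <= ...).
    """
    l1, l2, l3, l4 = (list(lams) + [0.0, 0.0, 0.0])[:4]
    values = [1.0 - math.exp(-l1)]
    values.append(values[0] - l1 * math.exp(-l2))
    values.append(values[1] - math.exp(-l3) * (l1 * l2 - l1**2 / 2))
    values.append(values[2] - math.exp(-l4) * (
        l1**3 / 6 + l1 * l2 * l3 - l1 * l2**2 / 2 - l3 * l1**2 / 2))
    return values[len(lams) - 1]


def biased_combined_brute_force(lams, max_count=20):
    """Sum Poisson probabilities over all bin counts with N_j >= j for all j."""
    edges = [0.0] + list(lams)
    # Expected counts in the regions between successive contours.
    means = [hi - lo for lo, hi in zip(edges, edges[1:])]
    pmf = [[math.exp(-mu) * mu**k / math.factorial(k)
            for k in range(max_count + 1)] for mu in means]
    total = 0.0
    for counts in product(range(max_count + 1), repeat=len(means)):
        running = 0
        for j, c in enumerate(counts, start=1):
            running += c          # N_j, the number above the j-th contour
            if running < j:
                break             # condition N_j >= j fails
        else:
            p = 1.0
            for b, c in enumerate(counts):
                p *= pmf[b][c]
            total += p
    return total


if __name__ == "__main__":
    example = [0.3, 0.8, 1.5, 2.1]  # arbitrary illustrative values
    for i in range(1, 5):
        print(i, biased_combined_closed_form(example[:i]),
              biased_combined_brute_force(example[:i]))
```

The truncation at `max_count` is adequate only when the expected counts between contours are small, as they are for the handful of rarest clusters considered here.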
$i$                      1     2     3     4     5     6     7     8     9     10
$R^{\rm th}_{i,<f}$      0.32  0.30  0.33  0.43  0.57  0.72  0.83  0.90  0.94  0.97
$R^{\rm th}_{i,>M>z}$    0.22  0.21  0.23  0.28  0.36  0.44  0.53  0.62  0.71  0.78

TABLE III: The rareness of the $i^{\rm th}$ rarest cluster in W11, using two different methods to define the contours of constant 
rareness in the $(M, z)$ plane. The value of $R$ should be interpreted as the probability that the $i^{\rm th}$ rarest existing cluster is at 
least as rare as the $i^{\rm th}$ rarest W11 cluster. 



These rareness measures are of course biased because they depend on more than one parameter. The bias can be 
removed by comparing observations to sample cluster populations. It can also be removed by a much more efficient 
calculation. It is actually only the $\lambda_i$ that we need to know the distributions for and not the full $(M, z)$ distribution of 
all clusters in the sky. The distribution for $\lambda_1$ is easily constructed because the distribution of $R^{\rm th}_1 = 1 - \exp(-\lambda_1)$ is 
uniform. It might be hoped that all of the $\lambda_i$ could be constructed from the corresponding, uniform $R^{\rm th}_i$ distribution. 
If all that was needed was the distribution of one $\lambda_i$ on its own this would be fine. However, what is actually needed is 
the conditional distribution for $\lambda_i$ given the values of all the other $\lambda_j$ with $j \neq i$. These distributions can be sampled 
with the following method. First, allocate a value to $\lambda_1$ using a uniform distribution for $R^{\rm th}_1$. This assigns the contour 
of the rarest cluster. Next it is necessary to assign the contour of the second rarest cluster. The probability of 
observing at least one cluster between the rarest contour and the second rarest contour is 

$$R^{\rm th}_{12} = 1 - e^{-\lambda_{12}}, \qquad (10)$$

where $\lambda_{12} = \lambda_2 - \lambda_1$. A value can be allocated to $\lambda_{12}$ using a uniform distribution for $R^{\rm th}_{12}$. This determines $\lambda_2$ and 
assigns the contour of the second rarest cluster. The same process will work for every $\lambda_i$ by first determining $\lambda_{i-1}$ 
and then allocating a value to $\lambda_{i(i-1)} = \lambda_i - \lambda_{i-1}$ using the uniform distribution, $R^{\rm th}_{i(i-1)} = 1 - \exp(-\lambda_{i(i-1)})$. This 
method is much more efficient than simulating an entire cluster population (with > 10** bins) and then finding the $i$ 
rarest clusters. Using the distributions for $\lambda_i$, the bias in $\hat{R}^{C}_i$ is removed by calculating the probability that $\hat{R}^{C}_i$ will 
be less than the value observed, 

$$R^{C}_i = P\left(\hat{R}^{C}_i < \hat{R}^{C,{\rm obs}}_i\right). \qquad (11)$$

For $i \le 4$, I have tested this method for calculating $R^{C}_i$ by applying it to sample cluster populations. It is unbiased 
and produces a uniform distribution. 
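
The sampling procedure described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used for the results in this work: the function names are hypothetical, only the two-cluster case is shown, and the biased two-cluster expression $1 - e^{-\lambda_1} - \lambda_1 e^{-\lambda_2}$ is taken as the quantity to be de-biased. Each increment between successive contours is drawn by inverting $u = 1 - \exp(-\Delta\lambda)$ with $u$ uniform on $[0, 1)$.

```python
import bisect
import math
import random


def sample_lambdas(n, rng):
    """Draw lambda_1 < ... < lambda_n by sequentially sampling the increments."""
    lams, current = [], 0.0
    for _ in range(n):
        u = rng.random()              # uniform value of 1 - exp(-increment)
        current += -math.log1p(-u)    # increment = -ln(1 - u)
        lams.append(current)
    return lams


def biased_combined_2(lams):
    """Biased combined rareness of the two rarest clusters."""
    l1, l2 = lams[:2]
    return 1.0 - math.exp(-l1) - l1 * math.exp(-l2)


def reference_distribution(n_draws=20_000, seed=0):
    """Sorted biased values for simulated sets of two rarest clusters."""
    rng = random.Random(seed)
    return sorted(biased_combined_2(sample_lambdas(2, rng)) for _ in range(n_draws))


def debiased_combined_2(observed_lams, reference):
    """Fraction of simulated sets with a smaller biased value than observed."""
    observed = biased_combined_2(observed_lams)
    return bisect.bisect_left(reference, observed) / len(reference)


if __name__ == "__main__":
    reference = reference_distribution()
    # Hypothetical observed contours, chosen only for illustration.
    print(debiased_combined_2([0.4, 1.1], reference))
```

Pre-sorting the reference sample means each observed set costs only a binary search, so many observed or mass-resampled sets can be evaluated against the same simulated distribution.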



4. Measures applied to observations 

Although none of the individual clusters in H10 and W11 are significantly rare, I speculated in Section III A 4 that 
perhaps the sets as a whole might be. It is now possible to check. In Table III I have listed $R^{\rm th}_{i,>M>z}$ and $R^{\rm th}_{i,<f}$ 
for the W11 clusters. I have used the method for accounting for mass uncertainties described in Section II A 2. To 
clear up a possible ambiguity, I sample once from each mass distribution in the entire W11 set, I then calculate the 
individual rareness of every cluster, pick the $i$ rarest and then calculate the combined rareness. I do not pre-assign, 
before making each sample, which are the rarest clusters. I use the same mass cut, redshift cut and $f_{\rm sky}$ as I did for 
the W11 entries of Table II. 

For the first few values of $i$, both contour definitions indicate that the rarest W11 clusters are slightly rarer than 
the ΛCDM average, but not with any statistical significance: there is at least a 20% chance of a set of clusters as 
rare as W11 occurring in ΛCDM. As $i$ increases, $R$ approaches 1 for both contour definitions. This indicates that, 
for the mass cut I used and for $i > 6$, the $i^{\rm th}$ rarest clusters in W11 are not as rare as they should be. This could be 
an indication of new physics; however, it is far more likely to be the result of neglecting selection function 
effects. I have already noted that with the mass cut I use, more than 100 clusters are expected to exist. There are 
only 26 clusters in W11. SPT do not expect to see every cluster that exists above the mass cut I have imposed. 
Therefore, it is probable that the expected but missing rare clusters do exist, but are near this mass threshold, rather 
than absent due to reasons of fundamental physics. The reason that selection function effects are more pronounced 
for larger values of $i$ is that larger mass clusters are both more likely to be seen and more rare. Therefore, it is more 
likely that the rarest cluster in any survey has been seen than, for example, the 5th rarest cluster. 

In Table IV I have listed $R^{C}_{i,<f}$, with $i \le 4$, for both W11 and H10. The method to account for mass uncertainties 
is exactly equivalent to that for Table III and the mass cut, redshift cut and $f_{\rm sky}$ are the same for each set as they were for 
Table II. Table IV points to the same conclusion as Table III. The total significance of the four rarest clusters shows 
that sets of clusters as rare as H10 and W11 would occur in > 30% of ΛCDM universes. This could mean that we 
live in a ΛCDM universe, or simply that the mass errors in both sets are too large to distinguish any new physics. 

$i$     1     2     3     4
W11    0.32  0.30  0.30  0.31
H10    0.29  0.31  0.33  0.36

TABLE IV: The combined rareness, $R^{C}_{i,<f}$, of the $i$ rarest clusters in H10 and W11. The value of $R$ should be interpreted as 
the probability that a set of $i$ clusters could exist that is at least as rare as the $i$ rarest clusters in H10 or W11. 

[Figure 5: histogram. Legend: W11 table of 26 clusters; ΛCDM with WMAP 7 cosmology.] 

FIG. 5: Histogram of the rareness measure introduced in H10 applied to both the W11 clusters and sample ΛCDM 
populations. This is equivalent to Figure 1 but for W11 instead of H10. 

Finally, in Figure 5 I have plotted histograms of the H10 product measure, built from $R_{>M>z}$ for all 26 clusters, for W11 
and for sample cluster populations with the same mass cut and $f_{\rm sky}$ as I have used for W11. The median value for W11 is 
$R$ = 1.7 x 10~^. This value or less occurs for $\sim 0.3$ of the simulated populations. This is consistent with what was seen 
using the other rareness measures in Tables III and IV. The equivalent figure for H10 is Figure 1. Figure 1 seems to indicate 
the H10 clusters are not rare enough for ΛCDM, whereas Table IV indicated that they were slightly too rare. Note however 
that the $i$ values in Table IV extend only to $i = 4$, whereas Figure 1 is for all 13 clusters in H10. As already discussed, it 
is very unlikely that the H10 ensemble really has seen the 13th rarest cluster in its observation window, whereas it is quite 
possible that it has seen the rarest four. An equivalent figure to Figure 1 including only the four rarest clusters in H10 and 
the sample cluster populations returns a result consistent with Table IV. This effect probably was not seen in the 
comparable W11 analysis because of its more complete survey. 

To go further in the analysis would require modelling of the selection function for each set of clusters. Given that 
both sets are slightly rarer than a ΛCDM average and the fact that accounting for the selection function could only 
make the observed clusters appear rarer, this might be an interesting exercise. However, accounting for cosmological 
uncertainties would weaken the strength of any obtainable result. To make an estimate of the effects of selection 
functions I have attempted re-doing the analyses presented here with a larger mass cut for W11. If I set the mass 
cut to give an expected number of clusters of 26, the measured rareness of W11 did become more significant, but 
$R$ did not fall below 0.05. However, if selection functions are to be properly taken into account, any analysis might 
as well go beyond just measuring the rareness of the most extreme clusters and measure instead the total goodness 
of fit of all the clusters observed. In the regions of sky considered by W11 and H10, many more clusters have been 
observed than are actually analysed. This is because both references made cuts ($z > 1$ for H10 and significance of 
detection ($\sim$ mass) for W11) that isolate the most extreme clusters. The purpose of measuring rareness is to find a 
way to quantify the likelihood of these most extreme observed clusters existing. It is less useful as a total goodness 
of fit measure for hundreds of clusters. 



W11 has 26 of the $\sim 100$ clusters expected after my choice of mass cut; H10 has only 13 of the $\gtrsim 300$ expected. Note that even in 
Table IV an increase in $R^{C}_i$ is noticeable for $i = 4$. 
        $R^{C}_{i,<f}$                                          $R^{\rm th}_{i,<f}$
$i$     $f_{\rm NL} = 50$   $f_{\rm NL} = 100$   $f_{\rm NL} = 500$     $f_{\rm NL} = 50$   $f_{\rm NL} = 100$   $f_{\rm NL} = 500$
1       0.39                0.30                 0.022                  0.40                0.30                 0.023
2       0.37                0.26                 0.0050                 0.38                0.24                 0.0024
3       0.36                0.24                 0.0012                 0.34                0.20                 3.0 X 10"*
4       0.34                0.21                 3.1 X 10"''            0.32                0.17                 4.5 X 10~^

TABLE V: The probability that the rarest clusters in a non-Gaussian universe could exist in a Gaussian universe. $R^{\rm th}_i$ 
quantifies the rareness of the $i^{\rm th}$ rarest cluster. $R^{C}_i$ quantifies the combined rareness of all of the $i$ rarest clusters. 



IV. RARENESS APPLIED TO PRIMORDIAL NON-GAUSSIANITY 

As a case study of the use of rareness measures I will apply some of the measures of the previous section to sample 
cluster populations arising from non-Gaussian primordial density perturbations. No attempts have been made to 
consider the effects of selection functions, cosmological uncertainties or mass errors. The benefit of measuring the 
rareness of sample populations is that this is not necessary. Therefore, although the results are instructive of what 
would happen in a statistics limited experiment, they are not representative of any near-future experiments. A more 
accurate analysis of the constraints that observations of a small number of extreme galaxy clusters could provide on 
non-Gaussianity must be left for future work. 

The method used to generate cluster populations from non-Gaussian primordial density perturbations is identical 
to that used for Gaussian primordial density perturbations except for the use of the mass function. I have chosen to 
replace the Tinker et al. mass function used earlier with the prescription for a non-Gaussian mass function outlined in 
Ref. [18]. This prescription provides a formula for the ratio between the full non-Gaussian mass function and the Gaussian 
mass function. The prescription then dictates that the formula for the ratio is to be multiplied by a mass function 
trusted in the Gaussian limit. There are many prescriptions in the literature that each give slightly different formulae 
for the ratio. To within the accuracy and range of masses, redshifts and quantity of non-Gaussianity currently tested 
by N-body simulations, they agree with each other. I have chosen this particular prescription because it is expected 
to be more stable (and therefore more accurate) at the larger masses, redshifts and quantities of non-Gaussianity that 
have not yet been tested by simulation. This seems appropriate given that I am examining the most extreme objects 
in sample cluster populations. As in Ref.[l^ and earlier sections of this work, for the Gaussian limit I will use the 
Tinker et al. mass function. This prescription parameterises the non-Gaussianity using the 'local' template which has 
the one parameter, /nl[1^3- 

In Table V I have calculated $R^{\rm th}_{i,<f}$ and $R^{C}_{i,<f}$. To do this I have used $f_{\rm sky} = 1$ and made a mass cut of $M$ > 10^^ $M_\odot$ 
and a redshift cut of $z > 0.1$. I have generated sample cluster populations using the quoted value of $f_{\rm NL}$ and then 
asked how rare each population would be in ΛCDM. I have not included $f_{\rm NL} = 0$ in the table because by construction 
it returns 0.5 for every $R$. Clearly, for large enough $f_{\rm NL}$, observation of even the rarest four galaxy clusters could 
rule out ΛCDM. Disappointingly, for smaller $f_{\rm NL}$ this is not the case. Nonetheless, the two measures do behave as 
expected. $R^{C}_i$ should quantify the total combined significance of the first $i$ $R^{\rm th}_i$ measures. This seems to be what 
is happening in the table. $R^{\rm th}_{i,<f}$ drops as $i$ increases, indicating that each individual statistic is further from ΛCDM. 
$R^{C}_{i,<f}$ drops appropriately, although not as rapidly because it 'remembers' the weaker significance of $R^{\rm th}_j$ for all $j < i$. 

The final result of this work is Figure 6, which has histograms of the H10 product measure (using $R_{>M>z}$) for 
$f_{\rm NL} = 100$ and ΛCDM. This result matches what is seen in Table V for $f_{\rm NL} = 100$. This shows consistency amongst 
the various rareness measures using different contours of constant rareness and different means to combine individual 
rareness. 



V. SUMMARY AND DISCUSSION 

In this work I have examined how to accurately quantify the rareness of extreme galaxy clusters. I have shown that 
the most typical method, $R_{>M>z}$, is biased and makes clusters appear less likely to exist than they actually are. The 
easiest way to understand this is the fact that the probability of a cluster existing with a given value of $R_{>M>z}$ is 
greater than $R_{>M>z}$ itself. Therefore, the event of observing a cluster with a small value of $R_{>M>z}$ is not necessarily 
expected to be an uncommon event. For example, in an actual ΛCDM universe, there is a $\sim 40\%$ probability of a 
cluster existing that would be claimed by $R_{>M>z}$ to rule out ΛCDM at 90% significance. I have also shown this bias 
explicitly by calculating $R_{>M>z}$ for sample cluster populations. 

FIG. 6: Histograms of the rareness measure introduced in H10 applied to a population of Gaussian ΛCDM universes and to a 
population of universes with $f_{\rm NL} = 100$. 



I have shown how to remove the bias in $R_{>M>z}$. I have also presented other unbiased measures of the rareness 
of individual clusters, $R_{<f}$ and $R_{>z}$. This set is not exhaustive of all possible measures. In fact, any prescription 
that defines a unique set of $M(z)$ contours in the $(M, z)$ plane will define a corresponding measure of rareness. For 
example, asking what the probability is that a given cluster is the most massive cluster in a survey window would be 
another possible rareness measure (see Ref. [20]). I have argued that $R_{<f}$ will be the measure that is most susceptible to 
deviations from ΛCDM arising through changes in the primordial spectrum of density perturbations and that $R_{>z}$ 
will be most susceptible to deviations in the expansion history and growth function of the primordial perturbations. 

I have presented methods to combine the individual rarenesses of a set of clusters into a combined measure of the 
'rareness' of the set. The product measure, first used in Hoyle et al. [13], simply multiplies together the individual 
rarenesses of $i$ clusters. This method of combining rarenesses introduces its own bias; however, the bias can be 
straightforwardly accounted for by comparing to sample populations. The most significant flaw in the product measure is 
that it is more sensitive to the rareness of the most extreme cluster than it is to any other cluster in a set. $R^{\rm th}_i$, 
first used in Mortonson et al. [4], instead quantifies the probability that the $i^{\rm th}$ rarest cluster in the universe could 
be as rare as the $i^{\rm th}$ rarest in a given set. $R^{\rm th}_i$ can then be calculated at each value of $i$ to look for possible 
deviations from ΛCDM. However, it could occur that $R^{\rm th}_i$ is small for all $i$, but never small enough to claim significant 
tension, or $R^{\rm th}_i$ could be significantly small for some values of $i$, but not for others. $R^{C}_i$, first presented here, 
quantifies the combined rareness of all of the $i$ rarest clusters. I have presented formulae for a biased value of $R^{C}_i$ 
(i.e. $\hat{R}^{C}_i$) for $i \le 4$ and a computationally efficient method to account for this bias (without the need to generate 
sample cluster populations). $R^{C}_i$ includes all the independent information regarding rareness that is contained in the 
masses and redshifts of the $i$ rarest clusters. 

I have applied these individual and combined rareness measures to the sets of clusters found in Ref. [13] (H10) and 
Ref. [14] (W11). The result is that each of the clusters has a greater than 50% probability of existing in a ΛCDM 
universe. There is also a greater than 20% probability that a set of clusters as rare as either H10 or W11 could exist 
in ΛCDM. These constraints are conservative and could be improved by accounting for the selection function of each 
sample and reducing the mass uncertainties. For the most extreme clusters in each sample I have also listed the 
"equivalent mass at redshift zero", which is the mass of a cluster at $z = 0$ that would be judged to be equally rare. 
While some of the clusters would be quite large at $z = 0$ they would not be without precedent. 

As a working example I applied the rareness measures to sample cluster populations generated using a non-Gaussian 
primordial spectrum. For very large non-Gaussianity, even the four rarest clusters could in principle rule out the 
standard cosmological model with high significance. For more realistic levels of non-Gaussianity, a small number of 
the most extreme clusters will not be able to provide significant tension. Note however that my analysis has used the 
non-Gaussianity parameter, $f_{\rm NL}$. It is expected that the abundance of extreme clusters (and voids) will be relatively 
more susceptible to higher order non-Gaussianity parameters (e.g. $g_{\rm NL}$, $\tau_{\rm NL}$) than $f_{\rm NL}$ (see for example Refs. [11, 12]). 

In Ref. [4] convenient fitting formulae were provided for $M(z)$ contours as a function of $f_{\rm sky}$ and the exclusion limit 
with which one desires to rule out ΛCDM (using $R_{>M>z}$). Unfortunately, generalising these contours to unbiased 
measures of rareness is not straightforward. The major problem is that the region that it is necessary to integrate 
over in the $(M, z)$ plane to quantify rareness without bias is experiment dependent. If an experiment cannot see clusters 
below a certain mass threshold, a cluster observed by this experiment should not be claimed to be 'likely' because 
equivalently rare clusters might exist below this threshold. A very conservative set of fitting formulae could be 
developed by assuming that an experiment can see everything in the $(M, z)$ plane; however, this would probably be 
so conservative that it would effectively be impotent. Therefore, any accurate fitting formulae would also need to 
depend on observational cuts made in the $(M, z)$ plane. 

When calculating the rareness of observed galaxy clusters my aim was to show that previously claimed sources of 
tension were a result of the use of the biased measure $R_{>M>z}$. Sometimes, because $R_{>M>z}$ was so small, the previous 
works in the literature made very conservative assumptions elsewhere in their analysis. I have also made these 
conservative assumptions. It is possible that a less conservative analysis of the rareness of observed galaxy clusters 
could still show significant tension with ΛCDM. This is especially true of the set of clusters in H10 (in constructing 
their table, H10 excluded clusters without a spectroscopic redshift measurement, deliberately chose mass estimates 
that had large errors, were conservative in their choice of $f_{\rm sky}$ and neglected any selection function effects). However, 
to avoid accidentally making false claims of tension, any future analysis will need to either account for the bias in 
$R_{>M>z}$, use one of the naturally unbiased measures presented here, or choose a new, well motivated prescription for 
a set of equal rareness contours in the $(M, z)$ plane. 



Note: 

During the preparation of this manuscript, Ref. [22] was posted to the arXiv. In Ref. [22], the gravitational lensing 
masses of 27 high-redshift clusters are presented. Ref. [22] claims to find significant tension with ΛCDM; however, they 
do this using the biased measures $R_{>M>z}$ and the product measure constructed from it. With a preliminary analysis, 
using $f_{\rm sky}$ = 100 sq. degrees, I do not find that any individual cluster in Ref. [22] provides tension with ΛCDM. Also, 
for the set of clusters in Ref. [22], $R^{C}_4$ = 0.19 (and similar values for $i < 4$). Tantalisingly, however, I find the 
following sequence of values for $R^{\rm th}_i$: 0.30, 0.24, 0.13, 0.072, 0.055, 0.061, 0.092, 0.37, 0.50 and 0.58. 
Unfortunately I have not yet developed the machinery to evaluate $R^{C}_i$ for $i > 4$ to test the actual significance of 
this curious trend. 